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Abstract 

In this article, for the first time, one develops a nonparametric methodology for an analysis of shapes of confi- 
gurations of landmarks on real 3D objects from regular camera photographs, thus making 3D shape analysis very 
accessible. A fundamental result in computer vision by Faugeras (1992), Hartley, Gupta and Chang (1992) is that 
generically, a finite 3D configuration of points can be retrieved up to a projective transformation, from corresponding 
configurations in a pair of camera images. Consequently, the projective shape of a 3D configuration can be retrieved 
from two of its planar views. Given the inherent registration errors, the 3D projective shape can be estimated from 
a sample of photos of the scene containing that configuration. Projective shapes are here regarded as points on 
projective shape manifolds. Using large sample and nonparametric bootstrap methodology for extrinsic means on 
manifolds, one gives confidence regions and tests for the mean projective shape of a 3D configuration from its 2D 
camera images. 
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1 Introduction 

Until now, statistical analysis of similarity shapes from images was restricted to a small amount of data, since simila- 
rity shape appearance is relative to the camera position with respect to the scene pictured. 

In this paper, for the first time, we study the shape of a 3D configuration from its 2D images in photographs of the 
configuration, without any camera positioning restriction relative to the scene pictured. Our nonparametric methodo- 
logy is manifold based, and uses standard reconstruction methods in computer vision. 

In absence of occlusions, a set of point correspondences in two views can be used to retrieve the 3D configuration of 
points. Faugeras (1992) and Hartley et. al. (1992) state that two such reconstructions differ by a projective transfor- 
mation in 3D. Sughatadasa (2006) and Patrangenaru and Sughatadasa (2006) noticed that actually the object which 
is recovered without ambiguity is the projective shape of the configuration, which casts a new light on the role of 
projective shape in the identification of a spatial configuration. 

Projective shape is the natural approach to shape analysis from digital images, since the vast majority of libraries of 
images are acquired via a central projection from the scene pictured to the black box recording plane. Hartley and 
Zisserman (2004, p.l) note that "this often rends classical shape analysis of a spatial scene impossible, since similarity 
is not preserved when a camera is moving." 

Advances in statistical analysis of projective shape have been slowed down due to overemphasis on the importance of 
similarity shape in image analysis, with little focus on the principles of image acquisition or binocular vision. Progress 
was also affected by lack of a geometric model for the space of projective shapes, and ultimately probably by insuffi- 
cient dialogue between researchers in geometry,computer vision and statistical shape analysis. 

For reasons presented above, projective shapes have been studied only recently, and except for one concrete 3D exam- 
ple due to Sughatadasa(2006), to be found in Liu et al. (2007), the literature was bound to linear or planar projective 
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shape analyzes. Examples of 2D projective shape analysis can be found in Maybank (1994), Mardia et. al. (1996), 
Goodall and Mardia (1999), Patrangenaru (2001), Lee et. al. (2004), Paige et. al. (2005), Mardia and Patrangenaru 
(2005), Kent and Mardia (2006, 2007) and Munk et. al. (2007). 

Our main goal here is to derive a natural concept of 3D shape that can be extracted from data recorded from camera 
images. The statistical methodology for estimation of a mean 3D projective shape is nonparametric, based on large 
sample theory and on Efron's bootstrap ( Efron (1979, 1982)). In this paper, a 3D projective shape is regarded as a 
random object on a projective shape space. Since typically samples of images are small, in order to estimate the mean 
projective shape we use nonparametric bootstrap for the studentized sample mean projective shape on a manifold, 
as shown in Bhattacharya and Patrangenaru (2005). This bootstrap distribution was essentially presented in Mardia 
and Patrangenaru (2005). Since, while running the projective shape estimation algorithm in Mardia and Patrangenaru 
(2005) on a concrete data set, Liu et al. (2007) have found typos in some formulas. In this paper we are making the 
necessary corrections. 

In section 2 we present projective geometry concepts and facts that are needed in section 3, such as projective space, 
projective frames, and projective coordinates. We also introduce computer vision concepts, such as essential matrix 
and fundamental matrix , associated with a pair of camera views of a 3D scene that is needed in the reconstruction of 
that scene from 2D calibrated, and respectively non-calibrated camera images. We then state the Faugeras-Hartley- 
Gupta-Chang projective ambiguity theorem for the scene reconstructed from two non-calibrated camera views. For 
the reconstruction of a configuration of points in space from its views in a pair of images, we refer to a computational 
algorithm in Ma et. al. (2006). 

In Section 3 we introduce projective shapes of configurations of points in R m or in RP m , and the multivariate axial 
geometric model for the projective shape space, which is our choice for a statistical study of projective shape. The 
Faugeras-Hartley-Gupta-Chang theorem is reformulated in Theorem 13. II in terms of projective shapes: iflZ is a 3D 
reconstruction of a spatial configuration C from two of its uncalibrated camera views, then 1Z and C have the same 
projective shape. This is the key result for our projective shape analysis of spatial configurations, which opens the 
Statistical Shape Analysis door to Computer Vision and Pattern Recognition of 3D scenes. 

Since projective shape spaces are identified via projective frames with products of axial spaces, in section 4 we ap- 
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proach multivariate axial distributions via a quadratic equivariant embedding of a product of q copies of RP m in 
products of spaces of symmetric matrices. A theorem on the asymptotic distributions of extrinsic sample means of 
multivariate axes, stated without proof and with some minor typos in Mardia and Patrangenaru (2005) is given in this 
section (Theorem 14. U with a full proof. The corrections to Mardia and Patrangenaru (2005) are listed in Remark 
14.31 The asymptotic and nonparametric bootstrap distribution results are used to derive confidence regions for ex- 
trinsic mean projective shapes. If a random projective shape has a nondegenerated extrinsic covariance matrix, one 
may studentize the extrinsic sample mean to generate asymptotically chi square distributions that are useful for large 
sample confidence regions in Corollary 14.21 or nonparametric bootstrap confidence regions if the sample is small in 
Corollary |4.4l If the extrinsic covariance matrix is degenerated, and the axial marginals have nondegenerated extrinsic 
covariance matrices, one gives a Bonferroni type of argument for axial marginals to derive confidence regions for the 
mean projective shape in Corollary 14.51 



2 Basic Projective Geometry for Ideal Pinhole Camera Image Acquisition 

Pinhole camera image acquisition is based on a central projection from the 3D world to the 2D photograph. Distances 
between observed points are not proportional to distances between their corresponding points in the photograph, and 
Euclidean geometry is inappropriate to model the relationship between a 3D object and its picture, even if the object is 
flat. The natural approach to ideal pinhole camera image acquisition is via projective geometry, which also provides a 
logical justification for the mental reconstruction of a spatial scene from binocular retinal images, playing a central role 
in human vision. In this section we review some of the basics of projective geometry that are useful in understanding 
image formation and scene retrieval from ideal pinhole camera images. 

2.1 Basics of Projective Geometry 

Consider a real vector space V. Two vectors x, y € ^\{0y} are equivalent if they differ by a scalar multiple. The 
equivalence class of x S ^\{0y} is labeled [x], and the set of all such equivalence classes is the projective space 
P(V) associated with V, P(V) = {[x],x € V\O v }- The real projective space RP m is P(R m+1 ). Another notation 



4 



for a projective point p = [x] S tf m , equivalence class of x = (x 1 , ■ ■ ■ , x m+1 ) G R m+1 ,p = [x 1 : x 2 : • • • : a;" l+1 ], 
features the homogeneous coordinates (x 1 , . . . , x m+1 ) of p, which are determined up to a multiplicative constant. A 
projective point p admits also a spherical representation , when thought of as a pair of antipodal points on the m 
dimensional unit sphere, p = {z, — z}, x = (x 1 , x 2 , . . . , x m+1 ), (x 1 ) 2 + ■ • • + (x m+1 ) 2 = 1. Ad- dimensional 
projective subspace of RP m is a projective space P(V), where V is a (d + 1) -dimensional vector subspace of R m+1 . 
A codimension one projective subspace of RP m is also called hyperplane. The linear span of a subset D of MP" 1 is 
the smallest projective subspace of MP" 1 containing D. We say that k points in MP" 1 are in general position if their 
linear span is MP" 1 . If k points in RP m are in general position, then k > m + 1. 

The numerical space R m can be embedded in RP m , preserving collinearity. An example of such an ajfine embedding 
is 

(2.1) h((u\...,u m )) = [u 1 : ... : u m : 1} = [u], 

where u — (tt 1 , . . . , u m , 1) T , and in general, an affine embedding is given for any A £ Gl(m + 1, R), by Iia(u) ~ 
[Au], The complement of the range of the embedding h in (12. U is the hyperplane MP™" 1 , set of points [a; 1 : • • • : 
x m : 0] e RP m . 

Conversely, the inhomogeneous ( affine ) coordinates (u 1 , . . . ,u m ) of a point p = [x 1 : x 2 : ■ ■ ■ : x m+1 ] E 
MP m \RP m - 1 are given by 

x j 

(2.2) v? = — fI ,yj = l,...,m. 

Consider now the linear transformation from R™ 1 +1 to R m+1 defined by the matrix B £ M(m + 1, m! + 1; R) and its 
kernel K = {x E R m ' +1 , Bx = 0}. The projective map (3 : RP m '\P(K) -> RP m , associated with B is defined by 

(2.3) = [Px]. 

In particular, a projective transformation (3 of RP m is the projective map associated with a nonsingular matrix B G 
GL(m + 1,R) and its action on MP" 1 : 

(2.4) ^([x 1 : ••• :a; m+1 ]) = [P(a;\ . . . , x m+1 )]. 



In affine coordinates ( inverse of the affine embedding d2.1| )), the projective transformation j2.4j is given by v = f(u), 
with 



3 I v-> m 
a m+l ' l~ii=\ a 
m+1 , n 
"m+l T Z^i=l a i 



(2.5) ^ ^rr, ^ry +1 , -■/ = ■ - 

where det B = det((a^ )i,j=i,...,m+i) 7^ 0. An t#n<? transformation of ILP m , u = + &, A € GL(m 7 M),t e 
is a particular case of projective transformation a, associated with the matrix B 6 GL(m + 1, M), given by 



(2.6) B 



\0£ ly 

A projective frame in an m dimensional projective space ( or projective basis in computer vision literature, see e.g. 
Hartley (1993)) is an ordered set of m + 2 projective points in general position. An example of projective frame in 
IP™ is the standard projective frame ([ei], . . . , [e m +i], [e\ + ... + e m +\\). 

In projective shape analysis it is preferable to employ coordinates invariant with respect to the group PGL(m) of 
projective transformations. A projective transformation takes a projective frame to a projective frame, and its action 
on RP m is determined by its action on a projective frame, therefore if we define the projective coordinatef s) of a point 
p e RP m w.r.t. a projective frame it — (p% , . . . , p m +2) as being given by 

(2.7) P*=p- 1 (p), 

where (3 S PGL(m) is a projective transformation taking the standard projective frame to it. These coordinates have 
automatically the invariance property. 

REMARK 2.1. Assume u, u±, . . . , u m +2 ore points in R m , such that it — ([ui], . . . , [u m +2\) is a projective frame. If 
we consider the (m + 1) x (m + 1) matrix U m — [uf , . . . , u^ n+1 ], the projective coordinates of p — [u] w.r.t. it are 
given by 

(2.8) p* = [y 1 (u):---:y m+1 (u)}, 
where 

(2.9) v(u) = U m x u T 



and 



(2.10) 



y*(u)= ,f {U) Vj = l,...,m + 1. 
(it m+2 ) 



Note that in our notation, the superscripts are reserved for the components of a point, whereas the subscripts are 
for the labels of points. The projective coordinate(s) of x are given by the point : • • • : z rn+1 {x)] <E RP m . 



2.2 Projective geometry and image acquisition in ideal digital cameras. 

An introduction to the geometry pinhole camera principle can be found in 3D-Vision texts including Ma et. al. 
(2006), Hartley and Zisserman (2004) |[T3l , Birchfeld (1998) |4|, etc. In this section we give such a description in our 
projective geometry notation. Ideal pinhole camera image acquisition can be thought of in terms of a central projection 
P : RP 3 \RP 2 — > RP 2 , whose representation in conveniently selected affine coordinates (x, y, z) £ R 3 , (u, v) £ R 2 
is given by 

x 

u = f~ 



(2.11) 



v = /-, 

z 



where / is the focal length, i.e. the distance from the image sensor or film to the pinhole or principal plane of the lens 
MP 2 , which is the complement of the domain of (3 in RP 3 . In homogeneous coordinates [x : y : z : w] , [u : v : t] the 
perspective projective map (3 can be represented by the matrix B 6 M(3, 4; R) given by: 



0^ 



(2.12) 



B = 



0/00 

yo o l o 

Digital cameras image acquisition is based on a slightly different projective transformation, that in addition takes 
into account internal camera parameters such as pixel aspect ratio, skewness parameter and principal point ( origin of 
image coordinates in the principal plane). For such cameras, the projective map ( 12.12t is altered by a composition 
with a matrix accounting for camera internal calibration parameters. If we also take into consideration the change 
of coordinates between the initial and current camera position involving a roto-translation (R, t) S 50(3) x R 3 , the 



projective map of a pinhole camera image acquisition tt is associated with the matrix: 

/- - \ / 



(2.13) 



B — CmtBG — 



k u k c uq 
k v v Q 
1 



/ 













(n 


A 

















/ 
























{01 


V 








1 


V 





= AG, 



where k u and k v are scale factors of the image plane in units of the focal length /, and 9 = cc4T x k c is the skew, and 
(uq, vo) is the principal point. The matrix A contains the internal parameters and the projection map ( 12.12b . while E 
contains the external parameters. The columns of the matrix B are the columns of a 3 x 3 matrix P followed by a 
3x1 vector p: 



(2.14) 



so that 



B={pp) 



(2.15) 



P = AR and p = At. 



2.3 Essential and fundamental matrices 

Consider now two positions of a camera directed at a point [u] G KLP 3 , and the projective points associated with its 
images taken at these locations of the camera, m a = [u a ] S RP 2 , a = 1, 2, where U\, U2 £ M 3 \{0}. If we assume the 
camera's internal parameters are known ( camera is calibrated), then, with respect to the camera's coordinates frame 
at each position, we may assume C; nt = ^3 . 

Since the lines joining the two locations of the camera optical center with the image points meet at [u] , these two lines 
and the line joining the two locations of the camera optical center are coplanar. The plane containing these lines is the 
epipolar plane associated with the point [u] . 

Assume we refer all the points to one coordinate system, say the coordinate system of the second position of the 
camera. The position vectors of first and second image points are t + Ru\, respectively 112, and and the vector from 
one optical center to the other is t. Here the change of coordinates between the Euclidean frames corresponding to the 
two camera positions is given by a roto-translation (R, t) £ SO(3) x K 3 . The three vectors above are directions of 
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lines in the epipolar plane, therefore 

(2.16) iij(t x {Ru t )) = 0. 

By defining t x as the matrix associated with the linear operator y — > t x $/ we can rewrite the equation (12.161 as 
follows 

(2.17) v%{t x Rut)) = u\Eu x = 0, 
where E = t x R is the so called essential matrix. 

If the camera is uncalibrated, then the matrices A\ — A 2 = A in ( 12.15b containing the camera internal parameters, 
yield the homogeneous pixel coordinates: 

(2.18) vt = Aut 

(2.19) v 2 = Au 2 . 

Thus: 

(2.20) (A~ 1 v 2 ) T (t x RA- X vt) = vjA' 1 ^ x RA^vx) = 0, 
and we obtain 

(2.21) vjFvt = 0, 

where F = (A^ 1 ) 7 ' E A^ 1 , with E the essential matrix in (12. 1 7b is the so called fundamental matrix. The fundamental 
matrix depends only on the relative position of the two cameras, and on their internal parameters. It has rank two, 
depending on seven real constants. 

2.4 Reconstruction of a 3D scene from two of its 2D images. 

If we select conveniently the coordinates for the first camera position, and also incorporating the internal parameters, 
we may assume that the matrix associated with 0% in equations ( 12.31 l and ( 12.131 l is B\ — (I|0), and the fundamental 
matrix factors as follows : F = t x R, where B 2 — {R\t) is the matrix defining (3 2 ( see equations (12.31 > and ( 12.131 >). 



Note that here R is nonsingular, and it does not necessarily represent the matrix of a rotation. Let [vi], [vz] G M.P 2 
be given by ( I2.18l l associated with a pair [ui], [112] G M.P 2 corresponding to matched points in two images. We seek 
a point [u] G M.P 3 such that [vi] = $i[u],i — 1, 2. From the relation vjFvi = v 2 t x Rvi = v^it x Rv\) = 0, 
it follows that v%,Rv\,t are linearly dependent and we may assume that Rv\ = bv 2 — at. Moreover, since v\ is 
defined up to a scalar multiple, we may assume that Rv\ = V2 — at, and define [u] G MP 3 by u = (vf , a) T . Now 
Biu = (I\Q)u = Vi, and B 2 u — (R\t)u — Rv\ + at = v%, therefore [u] is a desired solution to the reconstruction 
problem. As shown, [u] is determined by the two camera projection matrices B>i and B 2 - If we choose a different pair 
of camera matrices B\H and B 2 H yielding the same fundamental matrix F, then, in order to preserve the same pair 
of matched image points, the point [u] must be replaced by [H . 

PROBLEM 2.1. The problem of the reconstruction of a configuration of points in 3D from two ideal uncalibrated 
camera images, is equivalent to the following: given two camera images KLPj 2 , RP 2 of unknown relative position and 
unknown internal camera parameters, and two matching sets of labelled points {p a .\, ■ ■ ■ ,Pa.k\ C K-P^ , a — 1,2, 
find a configuration of points pi , . . . , pk G RP 3 such that there exist two positions of the camera RPj 2 , RPf f or which 
Pa(Pj) =Pa, 3 ,Va = 1,2, j = l,...,k. 

The above discussion proves the following theorem (Faugeras(1992), Hartley et al.(1992)): 

THEOREM 2.2. The reconstruction problem for two non calibrated camera images has a solution in terms of the 
fundamental matrix F = t x R. Any two solutions can be obtained from each other by a projective transformation in 
RP 3 . 

REMARK 2.2. Note that, although the configurations in correspondence are finite, their size is arbitrarily large, 
and the assumption of finite matching labelled pairs can be replaced by an assumption of parameterized sets in 
correspondence. Therefore, in absence of occlusions, a 3D configuration can be reconstructed from 2D images, and 
this reconstruction is unique up to a projective transformation. 
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2.5 Estimation of the fundamental matrix. 

Since equation ( 12. 2U is homogeneous as a linear equation in F, and F has rank two, this matrix depends on seven 
independent parameters. Therefore, in principle, F can be recovered from corresponding configurations of seven 
points. Due to the fact that the nature of digital imaging data is inherently discrete and errors occur also in landmark 
registration, F can be estimated using configurations of eight or more points p a ,i, a = 1, 2, i = 1, . . . k, k > 8, whose 
stacked homogeneous coordinates are the k x 3 matrices y a , a = 1,2. The linear system for F is 

(2.22) ylF Vl = 
and can be written as 

(2.23) fY = 0, 
where / is a vectorized form of F. 

A refined eight point algorithm for the estimate F of the fundamental matrix F can be found in Ma et al. (2006, p. 
188, p. 395). 

3 Projective Shape and 3D Reconstruction 

DEFINITION 3.1. Two configurations of points in M. m have the same projective shape if they differ by a projective 
transformation ofW 1 . 

Unlike similarities or affine transformations, projective transformations of R™ do not have a group structure under 
composition of maps( the domain of definition of the composition of two such maps is smaller than the maximal 
domain of a projective transformation in R m ). To avoid this complication, rather than considering the projective 
shapes of configurations in M m , we consider projective shapes of configurations in RP m . A projective shape of a 
k-ad ( configuration of k landmarks or labelled points ) is the orbit of that k-ad under projective transformations with 
respect to the diagonal action 

(3.1) afc(pi, ■ ■ ■ ,Pk) = (a(pi), (Pfc))- 
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Since the action ( 12.41 ) of j3 € PGL(m) on [x] G KLP" 1 , when expressed in inhomogeneous coordinates ( 12.21 ), reduces 
to ( 12.5b . if two configurations Ti, T2 of points in R m have the same projective shape, then h(Ti), h(T2) have the same 
projective shape in M.P m ( h is the affine embedding given by ( 12. IV ). 

Patrangenaru (1999, 2001) considered the set G(k, m) of k-ads (pi, ...,Pk), k > m+2, for which it = (pi, ...,p m+ 2) is 
a projective frame. PGL(m) acts simply transitively on G(fc, to) and the projective shape space -PSj^, is the quotient 
G(k, m)/PGL(m). Using the projective coordinates (p^ n+3 , . . . ,p%) given by ( 12.7b one can show that PS^ is a 
manifold diffeomorphic with (RP m ) fc ~ m ~ 2 . The projective frame representation has two useful features: firstly, the 
projective shape space has a manifold structure, thus allowing the use of the asymptotic theory for means on manifolds 
in Bhattacharya and Patrangenaru (2003, 2005). Secondly, it can be extended to infinite dimensional projective shape 
spaces, such as projective shapes of curves, as shown in Munk et al. (2007). This approach has the advantage of 
being inductive in the sense that each new landmark of a configuration adds an extra marginal axial coordinate, thus 
allowing to detect its overall contribution to the variability of the configuration, as well as the correlation with other 
landmarks. The effect of change of projective coordinates due to projective frame selection, can be understood via a 
group of projective transformations, but is beyond the scope of this paper. 

We return to the reconstruction of a spatial configuration. Having in view the definition [XT] of a projective shape of a 
configuration, Theorem l2.2l can be stated as follows: 

THEOREM 3.1. A spatial 1Z reconstruction of a 3D configuration C can be obtained in absence of occlusions from 
two of its ideal camera views. Any such 3D reconstruction 1Z ofC has the same projective shape as C. 

REMARK 3.1. Since the projective shape of the 3D reconstruction configuration from a pair of images is uniquely 
determined, and since multiplying by imposed internal camera parameters matrix keeps the projective shape of the 
reconstruction unchanged, one may also fix the internal camera parameters conveniently and estimate the essential 
matrix instead of the fundamental matrix. An eight point algorithm for estimation of the essential matrix is given in 
Ma et. al. (2004, p. 121), for given internal parameters. 

REMARK 3.2. Another approach to projective shape has been recently initiated by Kent and Mardia (2006, 2007). 
This approach has the advantage of being invariant with respect to the group of permutations of landmark indices, 
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however it involves a nonlinear approximation to the matrix solution of a data driven equation in an m x m matrix, 
and has not been yet applied in projective shape analysis for m > 1. 



4 Nonparametric Estimation and Testing for the Projective Shape a 3D Confi- 
guration 

Assume J : M — > R N is an embedding of the d dimensional complete manifold M. Bhattacharya and Patrangenaru 
(2003) denned the extrinsic mean [ij of a J— nonfocal random object ( r.o.) Y on M by 

(4.1) iij =: J-\Pj{n)), 

where \i = E( J(Y)) is the mean vector of J{Y) and Pj : T c — > J(M) is the ortho-projection on J(M) defined on the 
complement of the set T of focal points of J(M). The extrinsic covariance matrix of Y with respect to a local frame 
field y — > (/i(y), . . . , fd(y)), for which (dJ(fi(y)), . . . , dJ(fd(y))) are orthonormal vectors in R w , was defined in 
Bhattacharya and Patrangenaru (2005). If £ is the covariance matrix of JiY) (regarded as a random vector on R N ), 
then Pj is differentiable at /z. In order to evaluate the differential d^Pj one considers a special orthonormal frame 
field to ease the computations. A local ortho-frame field (ei(p), e2(p), . . . , ejv(p)) defined on an open neighborhood 
U C R N of Pj(M) is adapted to the embedding J if Vy G J' 1 ^), (e r (J(y)) = d y J(f r (y)), r = 1, . . . , <£ Let 
ei, e 2 , . . . , e^r be the canonical basis of R w and assume (ei(p), e 2 (p), . . . , euip)) is an adapted frame field around 
Pj(Ai) = Then T, E given by 



(4.2) 



^d M Pj(e 6 ) • e a (Pj(p))e a (P.,(p)) 
=i 

5^d„Pj(e 6 ) • e a (Pj^))e a {Pj{^) 



J 6=1,...,JV 
T 



a=l 



J 6=1,..., JV 

is the extrinsic covariance matrix of Y with respect to fd{^j))- The projective shape space PS^ is 

homeomorphic to M = (MP™) 9 , g = k — m — 2. RP m , as a particular case of a Grassmann manifold, is equivariantly 
embeddedin the space S(m+1) of (m+l)x (m+1) symmetric matrices ( Dimitric (1996)) via j : MP" 1 — > 
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given by 

(4.3) j([x}) = xx T . 

Patrangenaru (2001) and Mardia and Patrangenaru (2005) considered the resulting equivariant embedding of the pro- 
jective shape space 

J = 3k'- PZ k m = (R m ) 9 -» (S(m + 1))" 

defined by 

(4.4) jk{[xi], [a:,]) = (j[xi), ...,j[x g ]), 
where x s G R m+1 ,x^x s = l,Vs = 

REMARK 4.1. 77ze embedding jk in (14.41 1 yields the fastest known computational algorithms in projective shape 
analysis. Basic axial statistics related to Watson's method of moments such as the sample mean axis ( Watson(1983)) 
and extrinsic sample covariance matrix (Prenticef 1984)) can be expressed in terms ofj m +3 = j- 

A random projective shape Y of a fc-ad in KP m is given in axial representation by the multivariate random axes 

(4.5) (Y\...,Y q ),Y s = [X S },(X S ) T X S = l,Vs= 1, . . . , q = k - m - 2. 

From Bhattacharya and Patrangenaru (2003) or Mardia and Patrangenaru (2005) it follows that in this multivariate 
axial representation of projective shapes, the extrinsic mean projective shape of (Y 1 , . . . , Y q ) exists if Vs = 1, . . . , q, 
the largest eigenvalue of E(X S (X S ) T ) is simple. In this case p,j k is given by 

(4.6) fXj k = ([71 (m + 1)], [j q (m + 1)]) 

where X s (a) and j s (a),a = l,...,m+lare the eigenvalues in increasing order and the corresponding unit eigenvector 
ofE(X s (X s ) T ). 

If Y r , r = 1, . . . , n are i.i.d.r.o.'s from a population of projective shapes ( in its multi-axial representation), for which 
the mean shape p,j k exists, from a general consistency theorem for extrinsic means on manifolds in Bhattacharya and 
Patrangenaru (2003) it follows that the extrinsic sample mean [Y]j k is a strongly consistent estimator of p,j k . In the 
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multivariate axial representation, Y r is given by 

(4.7) Y r = ([X}], [*«]), {X») T X» = 1; a = 1, q. 
Let J s be the random symmetric matrix given by 

(4.8) J s = n- x ^ =l X s r {X s r ) T , s = l,...,q, 

and let d s (a) and g s (a) be the eigenvalues in increasing order and the corresponding unit eigenvector of J s , a = 
1, . . . , m + 1. Then the sample mean projective shape in its multi-axial representation is given by 

(4.9) Y jk<n = ([.gi(m + 1)], . . . , [g q (m + 1)]). 

REMARK 4.2. Some of the results in this section are given without a proof in Mardia and Patrangenaru (2005). For 
reasons presented in Remark \4.3\ we give full proofs of these results. 



To determine the extrinsic covariance matrix ( I4.2l i of ( 14.51 . we note that the vectors 
(4.10) f M - (0 ) ...,0, 7 .(a),0,...,0), 



with the only nonzero term in position s,s S l,q, a G l,m, yield a basis in the tangent space at the extrinsic 
mean T^ ik (KP m ) 9 , that is orthonormal with respect to the scalar product induced by the embedding jk- The vectors 



e (s,a),VsG l,g,Va€ 1, m, defined as follows: 

(4-11) e (s>0 ) =: d Nh jk(f( s ,a))- 

form an orthobasis of Tj k ^ 1 .^(JiLP m ) q . We complete this orthobasis to an orthobasis of q-tuples of matrices (e.i)i^i 
for (S(m+l)) q , that is indexed by the set X, the first indices of which are the pairs (s,a),s — 1, . . . , q; a = 1, . . . , m 
in their lexicographic order. Let E h a be the (m + 1) x (m + 1) matrix with all entries zero, except for an entry 1 in the 
position (a. b). The standard basis of S(m + 1) is given by e„ = E° a + E%, 1 < a < b < m + 1. For each s = 1, g 
, the vector 

(s e a) — (0tn+l7 •••) Om+l, e ai Om+l, m+ i) 
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has all the components zero matrices m +i € S(m + 1), except for the s-th component, which is the matrix e h a of the 
standard basis of S(m + 1, K); the vectors s e h al s = 1, . . . ,q,l < a < b < m + I listed in the lexicographic order of 
their indices (s, a, b) give a basis of S(m + l) q . 

Let £ be the covariance matrix of jkiY 1 , . . . , Y q ) regarded as a random vector in (S(m + l)) 9 , with respect to this 
standard basis, and let P =: P Jk : (S(m + l)) q j k ((RP m ) q ) be the projection on j k ((RP m ) q ). From (O it 
follows that the extrinsic covariance matrix of (Y 1 , . . . , Y q ) with respect to the basis ( 14.1 Oi l of T w (RP m ) q is given 
by 

Y.E = [e (S;0) (P( M )) • d,P{ r e b a )} (s=1) ... )9))(o=1) ... im) • E 

Assume Yj, . . . ,Y n are i.i.d.r.o.'s (independent identically distributed random objects) from a jfc-nonfocal probabi- 
lity measure on (M.P m ) q , and /x Jfc in ( I4.6l l is the extrinsic mean of Yi. We arrange the pairs of indices (s, a), s = 
1, . . . , q; a = 1, . . . , m, in their lexicographic order, and define the (mg) x (mq) symmetric matrix G n , with the 
entries 

Gn {s ,a),(ub) = n- l {d s {m + 1) - d s (a)) _1 (^(m + 1) - ^(fo))" 1 • 

n 

(4.13) • 5> s (a) T X r s )( fft (6) T ^)(. 9s (m + if 'X s r )(g t (m + lfX*). 

r=l 

LEMMA 4.1. G n is the extrinsic sample covariance matrix estimator ofY^E- 

Proof. The proof of lemma |4~T1 is based on the equivariance of the embedding ju. As a preliminary step note that 
the group SO(m + 1) acts as a group of isometries of M.P m . If R 6 SO(m + 1) and [a;] G MP" 1 then the action 
R([x}) = [Rx] is well defined. SO(m + 1) acts by isometries also on S + (m + 1,R) via R(A) = RAR T . Note that 
the map j(x) = xx T is equivariant since 

j{R[x\) = j([Rx}) = (Rx)(Rx) T = Rj([x])R T = R(j([x})). 

Therefore, for q > 1 the group (SO(m + l)) q acts as a group of isometries of (RP m ) q and also 

(4.14) ((R u ...,R q ) ■ (A l7 ...,A q ) = {R 1 A 1 Rl,...,R q A q R T q ),R J e SO(m + l),j = l,...,q. 
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The map jk is equivariant with respect to this action since 

3k{(Ri,-,Rq) ■ (Ni],->K])) = (Ri,-,Rq) •jk([zi],:;[x q \), 

(4.15) V(R 1 ,...,R q )e(SO(m + l))i,V({x 1 ],...,[x q })E(UP m )' 1 - 

We set M = ^((IF™) 5 ). Let M™ +1 be the set of all matrices of rank 1 in S+(m + 1). Note that M is the direct 
product of q copies of M™ + . Recall that 

(4.16) P : (S+(m+l,R)) q -> M 

is the projection on M. If Y r = ([X^:], . . . , [JQ?]),r = 1, ...,n, are i.i.d.r.o.'s from a probability distribution on 

(KP m )«, we set 

V r =Jk(Y r ). 

From the equivariance of jk, w.l.o.g. ( without loss of generality ) we may assume that jk(Y) = D = (D%, . . . , D q ) 
where D s £ S+(m + 1, M) is a diagonal matrix, s — 1, . . . , q. Therefore 

Y jk ,n = (\gi(m + l)],...,[g g (m + l)]) 

where Vs = 1, q, Va = 1 , . . . , to + 1, <? a (s) = e a are the eigenvectors of D s . 
It is obvious that if V is the sample mean of V r , r = 1, . . . , n, then 

(4-17) jk(Y jk , n ) = P(V) = P(D). 

Therefore w.l.o.g. we may assume that 

(4.18) g s (a) = e a , Vs = 1, . . . , q, Va = 1, . . . , m + 1, 

and that jk(p) = P(V) is given with p = ([e TO +i], . . . , [e m +i]). The tangent space T p (M.P m ) q can be identified with 
(l m )', and with this identification f( Sta ) in ( 14. 10b is given by /( s , a ) = (0, . . . , 0, e a , 0, . . . , 0) which has all vector 
components zero except for position s, which is the vector e a of the standard basis of K m . We may then assume that 
e (s,a)(D) '■= d p jk(ie s ). From a straightforward computation which can be found in Bhattacharya and Patrangenaru 
(2005) it follows that d D P( s e b a ) = , except for 

(4.19) d D P(( s e™ +1 )) = {d s (m + 1) - d s (a)}e M (P k (D)). 
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If Y r , r = 1, . . . , n is given by ( 14.71 ), from ( 14.191 ), ( 14.2b we obtain 

(4.20) (G„) (i>a) , - b) = n- x {<ii(m + 1) - d^y^d^m + 1) - c^)}- 1 £ 4 X r a ,X r b l X;»+ 1 j X r m+1 , 

r 

which is (14.13b expressed in the selected basis, thus proving the Lemma ■ 

The proof of Theorem l4. 1 l is elementary following from Lemma FTTl and from the observation that V\ has a multivariate 
distribution with a finite covariance matrix £ since (RP m ) q is compact. For n large enough, V has approximately a 
multivariate normal distribution A/"(/z, — S) and from the delta method ( Fergusson 1996, p. 45 ), it follows that 

(4.21) P(V) ~ M(P(p) = i fe (Mfc), -d^PXd^). 

n 

The range of the differential d^P is a subspace of Tp^jk((M.P m ) q ), therefore the asymptotic distribution of P(V) 
is degenerate. If we decompose S(m + l)) q = T PiM) j k ((M.P m ) q ) © T P ^j k ((M.P m ) q ) 1 - into tangent and normal 
subspaces, then the covariance matrix of the tangential marginal distribution of tanP(V) is ^£e, which is nonde- 
generate because the generalized extrinsic covariance is given by the determinant det(T,E) — n' =1 A s (a), which is 
positive. Because V is a strongly consistent estimator of fj,, and S n is a strongly consistent estimator of £, from 
Slutsky's theorems (Fergusson, 1996, p.42) it follows that G n in ( 14.13b is a strongly consistent estimator of Sg. 
Let U = [( S U\, ..., s C/ m ) s= i i ... ! g] T be the random vector whose components are the components of tanP(V) w.r.t. 
the basis ers, a)(D) which is given in the proof of Lemma |4~T1 Since G n is a consistent estimator of it fol- 
lows that Z n = \/nG n 2 U converges to a 7V(0, I mq ) - distributed random vector, and Z^Z n converges to a random 
variable with a chi-square distribution with mq degrees of freedom. If one uses the equivariance again, one gets 
Z^Z n = T(Yj k<n ; fi) in ( 14.23b . which completes the proof of Theorem l4.1I B 
In preparation for an asymptotic distribution of Yj htTl we set 

(4.22) D s = (g a (l),...,g a (m))eM(m+l,m;m),s=l,...,q. 

If \i = ([71], ■ • ■ , [jq]), where 7 S e R m+1 , 7j7 s = 1, for s = 1, . . . , q, we define a Hotelling's T 2 type-statistic 
(4-23) T(Y jk>n ;ri = n{^D u . . .,^D q )G-\^D u . . .,^D q ) T , 
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THEOREM 4.1. Assume (5^-)r=i,...,n are i.i.d.r.o.'s on (RP™ 1 ) 9 , and Y\ is jk-nonfocal. Let A s (a) andj s (a) be 
the eigenvalues in increasing order, respectively the corresponding unit eigenvectors of E[X^(X^) T ]. If A s (l) > 
0, for s = 1, . . . , q, then T(Yj h>n ; fij k ) converges weakly to a Xmq distributed random variable. 

If Yi is a jfe-nonfocal population on (RP m ) q , since (M.P m ) q is compact, it follows that jk (Yi ) has finite moments 
of sufficiently high order. According to Bhattacharya and Ghosh (1978), this, along with an assumption of a nonzero 
absolutely continuous component, suffices to ensure an Edgeworth expansion up to order 0(n~ 2 ) of the pivotal statistic 
T(Yj ktTl ; fJ>j k ), and implicitly the bootstrap approximation of this statistic. 

COROLLARY 4.1. Let Y r = ([A"*], . . . , [A?]), Xj t X st = 1, s = 1, . . . q, r = 1, . . . , n, be i.i.d.r.o.'s from a jk- 
nonfocal distribution on (]RP m ) 9 which has a nonzero absolutely continuous component, and with Yie > 0. For a 
random resample with repetition (Y* , Y*) from (Yi, . . . ,Y n ), consider the eigenvalues d* s (a), a = 1, . ..,m + 1 
°f n E"=i X* S X*J in their increasing order, and the corresponding unit eigenvectors g* s {a) : a = 1, . . . ,m + 1. Let 
G* be the matrix obtained from G n , by substituting all the entries with ^—entries. Then the bootstrap distribution 
function of the statistic 

T(Y* jk ;Y jk ) = n( 9l (m + l)D{, . . .,g q (m + l)D* q ) G*' 1 
(4.24) ( 5l (m + l)Dl, . . .,g q (m + l)D* q ) T 

approximates the true distribution ofT([Yj k ; /ijj) given by ( I4.23l l, with an error of order p (n~ 2 ). 

REMARK 4.3. The above corollary is from Mardia and Patrangenaru (2005). Formula (14.24t in that paper has 
unnecessary asterisks for g s (to + 1 ) , a typo that is corrected here. Also the condition Tie > is missing there, as well 
as in their Theorem 5.1. Another typo in Mardia and Patrangenaru (2005) is in their definition of D s : the last column 
of D s should not be there. The correct formula is (14.221 l. Note that D s — (D s \g s (m + 1)). 

Theorem 14. II and Corollary 14. 1 1 are useful in estimation and testing for mean projective shapes. We may derive 
from (14. U the following large sample confidence region for an extrinsic mean projective shape 

COROLLARY 4.2. Assume (i / r ) r= i ! ... in are i.i.d.r.o.'s from a j^—nonfocal probability distribution on (RP m ) 9 , and 
Y>e > 0. An asymptotic (1 — a)-confidence region for [ij k = [v] is given by R a (Y) = {[v] : T(Yj kt7l ; [u]) < Xmq,a}> 
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where T(jYj k , \v\) is given in ( |4.23l l. If the probability measure of Y\ has a nonzero-absolutely continuous component 
w.r.t. the volume measure on (RP m ) q , then the coverage error of R a (Y) is of order Op(n _1 ). 

For small samples the coverage error could be quite large, and the bootstrap analogue in Corollary |4.1l is preferable. 
Consider for example the one sample testing problem for mean projective shapes: 

(4.25) H : n jk = (j,o vs. Hi : fi jh ^ fj, . 

COROLLARY 4.3. The large sample p -value for the testing problem ( 14.251 1 is p = Pr(T > T(Yj k _ n ; Ho)), where 
T{Yj k , n ]li) is given by ( 14.231 ). 

In the small sample case, problem ( 14.251 ) can be answered based on Corollary |4.1l to obtain the following 100(1 — 
a)% bootstrap confidence region for pij k : 

COROLLARY 4.4. Under the hypotheses of Corollar\ \4.1\ The corresponding 100(1 — a)% confidence region for 
fj, jk is 

(4-26) C* ta :=j^(Ul a ) 
with U* a given by 

(4.27) U: ia = {/i G M(RP m ) q ) ■ T(V jh ,n,n) < c!-J, 

where c*_ Q is the upper 100(1 — a)% point of the values of T(Y j k ;Y j k ) given by ( 14. 241 ). The region given by 
( I4.26b -( l4.27b has coverage error Op{n~ 2 ). 

If Tie is singular and all the marginal axial distributions have positive definite extrinsic covariance matrices, one 
may use simultaneous confidence ellipsoids to estimate /j,j k . Assume {Y r ) r= i. , n are i.i.d.r.o.'s from a jk— nonfocal 
probability distribution on (M.P m ) q . For each s = 1, . . . , q let E s be the extrinsic covariance matrix of Yf, and let 
Y S j n and G Sin be the extrinsic sample mean and the extrinsic sample covariance matrix of the s-th marginal axial. If 
the probability measure of Y* has a nonzero-absolutely continuous component w.r.t. the volume measure on (RP m ), 
and if for s = 1, . . . , q and for [^ s ] € RP m , jf^ s — 1, we consider the statistics: 

(4.28) T s = T s (F* n , [ 7s ] ) = ni jD s G-} n D T sls 

20 



and the corresponding bootstrap distributions 

(4.29) T* s = T a (Y** n , ;F;„) = ng s (m + 1) T D*G* 8>n ~ l D* T g g (m + 1). 

Since by Corollary 14. li r„ has asymptotically a \m distribution, we obtain the following 

COROLLARY 4.5. For s = 1, . . . , q let c* sl _ a be the upper 100(1 - a)% point of the values ofT* given by J4.29I ). 
We set 

(4-30) c^^j^iu;^) 

with U* n g given by 

(431) Ul ntf} = G j(RP m ) : T s (yl n ; p) < c^}. 

If 

(4-32) K ta = ^UiC* s>n , v 

with C* n p, U* n p given by J4.30l )-( |4.31| ), then i?* a is a region of at least 100(1 — a)% confidence for fij k . The 
coverage error is of order Op{n~ 2 ). 

REMARK 4.4. If Tie is singular, one may also use a method for constructing nonpivotal bootstrap confidence regions 
for fj,j h using Corollary 5.1 of Bhattacharya and Patrangenaru (2003). 
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